Extractive Multi-Document Summarization with Integer Linear Programming and Support Vector Regression
Abstract
We present a new method to generate extractive multi-document summaries. The method uses Integer Linear Programming to jointly maximize the importance and the diversity of the sentences it includes in the summary, without exceeding a maximum allowed summary length. The importance score of each sentence is obtained from a Support Vector Regression model trained on human-authored summaries, whereas the diversity of the selected sentences is measured as the number of distinct word bigrams in the resulting summary. Experimental results on widely used benchmarks show that our method achieves state-of-the-art results compared to competitive extractive summarizers, while also being computationally efficient.
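The selection problem described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it replaces the ILP solver with an exhaustive search over sentence subsets (feasible only for a handful of sentences), takes the importance scores as given numbers rather than Support Vector Regression predictions, and combines importance and bigram diversity with a hypothetical weight `lam` and a simple normalization, both of which are assumptions made for illustration.

```python
from itertools import combinations


def bigrams(sentence):
    """Return the set of distinct word bigrams in a sentence."""
    words = sentence.lower().split()
    return set(zip(words, words[1:]))


def summarize(sentences, importance, max_words, lam=0.5):
    """Pick the subset of sentences with the best combined score.

    Assumptions (not from the paper): `lam` trades off importance
    against diversity, importance is normalized by the number of
    sentences, and diversity by the total number of distinct bigrams.
    Exhaustive search stands in for an ILP solver.
    """
    n = len(sentences)
    sentence_bigrams = [bigrams(s) for s in sentences]
    lengths = [len(s.split()) for s in sentences]
    total_bigrams = len(set().union(*sentence_bigrams)) or 1

    best_subset, best_score = (), 0.0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            # Length constraint: skip summaries that are too long.
            if sum(lengths[i] for i in subset) > max_words:
                continue
            covered = set().union(*(sentence_bigrams[i] for i in subset))
            score = (lam * sum(importance[i] for i in subset) / n
                     + (1 - lam) * len(covered) / total_bigrams)
            if score > best_score:
                best_subset, best_score = subset, score
    return [sentences[i] for i in best_subset]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on the rug",
        "dogs bark loudly at night",
    ]
    scores = [0.9, 0.8, 0.5]  # placeholder importance scores
    print(summarize(docs, scores, max_words=12))
```

In this toy input the two highest-scoring sentences share most of their bigrams, so with `lam=0.5` the diversity term favors pairing the first sentence with the third instead. Setting `lam=1.0` ignores diversity and selects the two near-duplicate sentences.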
Similar resources
Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning
We present a dual decomposition framework for multi-document summarization, using a model that jointly extracts and compresses sentences. Compared with previous work based on integer linear programming, our approach does not require external solvers, is significantly faster, and is modular in the three qualities a summary should have: conciseness, informativeness, and grammaticality. In additio...
Multi-document Summarization Using Support Vector Regression
Most multi-document summarization systems follow the extractive framework based on various features. While more and more sophisticated features are designed, the reasonable combination of features becomes a challenge. Usually the features are combined by a linear function whose weights are tuned manually. In this task, Support Vector Regression (SVR) model is used for automatically combining th...
TGSum: Build Tweet Guided Multi-Document Summarization Dataset
The development of summarization research has been significantly hampered by the costly acquisition of reference summaries. This paper proposes an effective way to automatically collect large scales of news-related multi-document summaries with reference to social media’s reactions. We utilize two types of social labels in tweets, i.e., hashtags and hyper-links. Hashtags are used to cluster doc...
Joint Optimization of User-desired Content in Multi-document Summaries by Learning from User Feedback
In this paper, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. Our method interactively obtains user feedback to gradually improve the results of a state-of-the-art integer linear programming (ILP) framework for MDS. Our methods complement fully automatic methods in producing highqua...
Multi-Document Abstractive Summarization Using ILP Based Multi-Sentence Compression
Abstractive summarization is an ideal form of summarization since it can synthesize information from multiple documents to create concise informative summaries. In this work, we aim at developing an abstractive summarizer. First, our proposed approach identifies the most important document in the multi-document set. The sentences in the most important document are aligned to sentences in other ...